Discriminative Training and Variational Decoding in Machine Translation via Novel Algorithms for Weighted Hypergraphs

Author

  • Zhifei Li
Abstract

A hypergraph or “packed forest” is a compact data structure that uses structure-sharing to represent exponentially many trees in polynomial space. A probabilistic/weighted hypergraph also defines a probability (or other weight) for each tree, and can be used to represent the hypothesis space considered (for a given input) by a monolingual parser or a tree-based translation system (e.g., tree-to-string, string-to-tree, tree-to-tree, or string-to-string with latent tree structures). Given a weighted/probabilistic hypergraph, we might ask three questions: What atomic operations can we perform on the weighted hypergraph? How do we set the weights in the hypergraph? Which particular translation (among the possible translations encoded in the hypergraph) should we present to an end user? These correspond to three fundamental problems — inference, training, and decoding — for which this dissertation presents novel techniques.

The atomic inference operations we may want to perform include finding the one-best or k-best derivations, or computing expectations over the hypergraph. To perform each operation, we could implement a dedicated dynamic programming algorithm; however, semiring-weighted logic programming offers a more general framework in which to specify such algorithms. Within this framework, we first extend the expectation semiring, originally proposed for finite-state automata, to hypergraphs. We then propose a novel second-order expectation semiring. These semirings can be used to compute a large number of expectations (e.g., entropy and its gradient) over the exponentially many trees encoded in a hypergraph.

The weights used in a hypergraph are usually learned by a discriminative training method. One common drawback of such methods is that they rely on the existence of high-quality supervised data (i.e., bilingual data), which may be expensive to obtain. We present two unsupervised discriminative training methods, minimum imputed-risk training and contrastive language model estimation, both of which can exploit monolingual English data to perform discriminative training. In minimum imputed-risk training, we first use a reverse translation model to impute the missing inputs, and then train a discriminative forward model by minimizing the expected loss of the forward translations of the missing inputs. In contrast, contrastive language model estimation does not use a reverse system. It first extracts a confusion grammar, then uses the confusion grammar to generate many alternative sentences (i.e., a contrastive set) for each English sentence, and finally trains a discriminative language model on the contrastive sets such that the model will prefer the original English sentences over the alternatives in their contrastive sets.
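To make the semiring-weighted view concrete, below is a minimal sketch (not the dissertation's implementation) of a generic inside algorithm over a hypergraph, instantiated with a first-order expectation semiring so that a single bottom-up pass returns both the total weight Z and an expectation over all trees. The hypergraph encoding and all names (`Edge`, `inside`, `ExpectationSemiring`) are illustrative assumptions; the second-order semiring extends the same idea with extra components for quantities such as the gradient of entropy.

```python
from collections import namedtuple

# A hyperedge: head node id, tuple of tail node ids, and a semiring value.
Edge = namedtuple("Edge", ["head", "tails", "value"])

class ExpectationSemiring:
    """First-order expectation semiring: elements are pairs (p, r), where p
    is a probability-like weight and r accumulates p-weighted values of an
    additively decomposing quantity (e.g., word count, or -log p for entropy)."""
    zero = (0.0, 0.0)
    one = (1.0, 0.0)

    @staticmethod
    def plus(a, b):
        return (a[0] + b[0], a[1] + b[1])

    @staticmethod
    def times(a, b):
        # (p1, r1) * (p2, r2) = (p1 p2, p1 r2 + p2 r1)
        return (a[0] * b[0], a[0] * b[1] + a[1] * b[0])

def inside(nodes, edges, semiring):
    """Generic inside algorithm. `nodes` must be in topological order
    (tails before heads). Returns a dict mapping node -> inside value."""
    incoming = {v: [] for v in nodes}
    for e in edges:
        incoming[e.head].append(e)
    beta = {}
    for v in nodes:
        if not incoming[v]:                 # leaf node: semiring identity
            beta[v] = semiring.one
            continue
        total = semiring.zero
        for e in incoming[v]:
            prod = e.value
            for t in e.tails:
                prod = semiring.times(prod, beta[t])
            total = semiring.plus(total, prod)
        beta[v] = total
    return beta

# Toy forest with two derivations of "goal". Each edge carries
# (p_e, p_e * v_e), where v_e is the number of words the edge emits.
edges = [
    Edge("goal", ("x",),     (0.6, 0.6 * 2)),   # p = 0.6, emits 2 words
    Edge("goal", ("x", "y"), (0.4, 0.4 * 3)),   # p = 0.4, emits 3 words
]
Z, r = inside(["x", "y", "goal"], edges, ExpectationSemiring)["goal"]
print(Z, r / Z)   # partition function 1.0, expected word count 2.4
```

With a different choice of per-edge value v_e, the same pass computes other expectations (e.g., setting v_e = -log p_e is the first step toward computing entropy over the forest).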


Similar Articles

Efficient Minimum Error Rate Training and Minimum Bayes-Risk Decoding for Translation Hypergraphs and Lattices

Minimum Error Rate Training (MERT) and Minimum Bayes-Risk (MBR) decoding are used in most current state-of-the-art Statistical Machine Translation (SMT) systems. The algorithms were originally developed to work with N-best lists of translations, and recently extended to lattices that encode many more hypotheses than typical N-best lists. Here we extend lattice-based MERT and MBR algorithms to ...
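As a rough illustration of the MBR criterion that this work extends from N-best lists to lattices and hypergraphs, here is a hedged sketch of plain N-best MBR decoding: choose the hypothesis with the highest expected gain under the model posterior. The gain function, the names, and the toy N-best list are assumptions for illustration, not the paper's algorithm.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def gain(hyp, ref, max_n=4):
    """A crude sentence-level BLEU-like gain between two hypotheses
    (clipped n-gram precision, geometric mean, no brevity penalty)."""
    precisions = []
    for n in range(1, max_n + 1):
        h, r = ngram_counts(hyp, n), ngram_counts(ref, n)
        overlap = sum(min(c, r[g]) for g, c in h.items())
        total = max(1, sum(h.values()))
        precisions.append((overlap + 1e-9) / total)   # smooth zero counts
    return math.exp(sum(math.log(p) for p in precisions) / max_n)

def mbr_decode(nbest):
    """nbest: list of (tokens, posterior). Returns the hypothesis with the
    highest expected gain under the model's posterior distribution."""
    best, best_score = None, float("-inf")
    for cand, _ in nbest:
        expected = sum(p * gain(cand, other) for other, p in nbest)
        if expected > best_score:
            best, best_score = cand, expected
    return best

nbest = [
    ("the cat sat on the mat".split(), 0.5),
    ("the cat sits on the mat".split(), 0.3),
    ("a cat sat on a mat".split(), 0.2),
]
print(" ".join(mbr_decode(nbest)))
```

The lattice and hypergraph extensions described in the paper avoid this O(N^2) enumeration by exploiting the packed structure of the hypothesis space.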

Full text

Forest Reranking for Machine Translation with the Perceptron Algorithm

We present a scalable discriminative training framework for parsing-based statistical machine translation. Our framework exploits hypergraphs (or packed forests) to compactly encode exponentially many competing translations, and uses the perceptron algorithm to learn to discriminatively prefer the oracle-best tree in the hypergraph. To facilitate training, we present: (i) an oracle extraction a...
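As a rough illustration of the perceptron update behind such discriminative reranking (simplified to an explicit candidate list rather than the hypergraph itself), a sketch follows; the function and feature names are assumptions, not the paper's code.

```python
def perceptron_rerank_update(weights, candidates, oracle_index, lr=1.0):
    """One structured-perceptron update for reranking.
    candidates: list of sparse feature dicts, one per candidate translation
    (in the paper these would be derivations packed in a hypergraph);
    oracle_index: index of the candidate closest to the reference."""
    def score(feats):
        return sum(weights.get(k, 0.0) * v for k, v in feats.items())

    predicted = max(range(len(candidates)), key=lambda i: score(candidates[i]))
    if predicted != oracle_index:
        # Move weights toward the oracle's features and away from the
        # model's current favorite.
        for k, v in candidates[oracle_index].items():
            weights[k] = weights.get(k, 0.0) + lr * v
        for k, v in candidates[predicted].items():
            weights[k] = weights.get(k, 0.0) - lr * v
    return weights

# Toy example: two candidates described by sparse features.
w = {}
cands = [{"lm": 1.2, "len": 6.0}, {"lm": 0.8, "len": 5.0}]
w = perceptron_rerank_update(w, cands, oracle_index=1)
print(w)
```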

Full text

Joshua 2.0: A Toolkit for Parsing-Based Machine Translation with Syntax, Semirings, Discriminative Training and Other Goodies

We describe the progress we have made in the past year on Joshua (Li et al., 2009a), an open-source toolkit for parsing-based machine translation. The new functionality includes: support for translation grammars with a rich set of syntactic nonterminals, the ability for external modules to posit constraints on how spans in the input sentence should be translated, lattice parsing for dealing wit...

Full text

Search Based Weighted Multi-Bit Flipping Algorithm for High-Performance Low-Complexity Decoding of LDPC Codes

In this paper, two new hybrid algorithms are proposed for decoding Low-Density Parity-Check (LDPC) codes. The original version of the proposed algorithms is named Search-Based Weighted Multi-Bit Flipping (SWMBF). The main idea of these algorithms is to flip multiple variable bits in each iteration, choosing the flips that lead to the syndrome vector with the least Hamming weight. To achieve this, the proposed algo...
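The paper's SWMBF algorithm flips multiple bits per iteration; as a rough illustration of the underlying idea (choosing flips that most reduce the Hamming weight of the syndrome), here is a simplified single-bit greedy flipping decoder on a toy Hamming-style code. This is an assumption-laden sketch, not the proposed algorithm.

```python
import numpy as np

def bit_flip_decode(H, x, max_iters=50):
    """Greedy hard-decision bit-flipping decoder for an LDPC-style code.
    Each iteration flips the single bit whose flip most reduces the
    Hamming weight of the syndrome s = H x (mod 2); stops when s = 0."""
    x = x.copy() % 2
    for _ in range(max_iters):
        syndrome = H.dot(x) % 2
        if not syndrome.any():
            return x, True                      # valid codeword found
        weights = []
        for j in range(len(x)):
            # Flipping bit j toggles exactly the checks in column j of H.
            weights.append(np.count_nonzero((syndrome + H[:, j]) % 2))
        best = int(np.argmin(weights))
        if weights[best] >= np.count_nonzero(syndrome):
            break                               # no single flip helps
        x[best] ^= 1
    return x, False

# Toy (7,4) Hamming-style parity-check matrix and a received word
# equal to the codeword [1,0,1,1,0,1,0] with its first bit flipped.
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
received = np.array([0, 0, 1, 1, 0, 1, 0])
decoded, ok = bit_flip_decode(H, received)
print(decoded, ok)   # recovers [1 0 1 1 0 1 0], True
```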

Full text


Publication date: 2010